By Ken Coar, Jun 29, 2000, 16:47 UTC
Maxwell's Demon and Hat Colour
"Long ago and far away
Maxwell felt the need one day
For a Demon, scarce as high
As the atoms going by.
Over heat he gave it sway,
Making warmth go either way
From the vector Nature gave.
Maxwell's Demon, come and save!"
-- Christopher Stasheff, Her Majesty's
Wizard
Chances are that your Web site has at least a few pages
that you really don't want published to the Internet at large. How do
you keep the Black Hats from seeing them, whilst not impeding the access
of the White Hats that need the pages?
What Apache Security Won't Help
At the time I'm writing this (February 2000), there's a
lot of current-events news about major Web sites being taken down temporarily
by denial-of-service (DoS) attacks. The specific attack type in question
cannot be stopped by Apache, even though it may be aimed at the
Web site. Apache is just a software application running on the system;
these attacks are aimed at the systems themselves. As someone has pointed
out, "If you have 1GB/s heading for your server then the pipe is
going to saturate before Apache even gets a chance to see the packets."
But for less extreme cases, Apache's implementation of the
Web security mechanisms, when properly configured, should be more than
adequate to protect your sensitive pages from exposure.
Assumptions in This Article
For the rest of this article, I'm going to make the following
assumptions:
- your Apache source tree starts at
./apache-1.3/
- your Apache ServerRoot is
/usr/local/web/apache
- your Apache DocumentRoot is
/usr/local/web/htdocs
- the username under which Apache runs (the value of the
User
directive in your httpd.conf file) is nobody
All of the cd and other shell commands in this
article that refer to directories use these locations.
Mandatory versus Discretionary Access Control
There are two basic types of access control: those that
verify who you say you are, and those that verify who you really
are. The three basic verification methods are to check
- what you have,
- what you know, or
- what you are
or even some combination of these. In common non-computer
usage, an example of the 'what you have' method would be having the key
to a padlock; you can get in if you do. 'What you know' is the method
used to keep other people out of your account; if they don't know
your password, tough luck for them. And 'what you are' is coming into
prominent play in criminal investigations, as DNA patterns are admitted
as evidence.
The best security systems use a combination. Your bank's
teller machines, for instance, use a combination of the first two methods:
you need to have the ATM card, and know the PIN associated
with the card (or the account).
But what's all this noise about 'discretionary' and 'mandatory,'
you ask? Put simply, discretionary control (DAC) mechanisms check the
validity of the credentials given them at the discretion of the user,
and mandatory access controls (MAC) validate aspects that the user cannot
control. For instance, anyone can tell you their username and password and
you can then log in with them; which username and password you supply
is at your discretion, and the system can't tell you apart from the real
owner. Your DNA is something you can't change, though, and a control
system that only allowed access to your pattern would never work for anyone
else -- and you couldn't pretend to be someone else, either. This makes
such a system a mandatory (also called non-discretionary)
access control system.
In Web terms, and Apache terms in particular, discretionary
controls are based on usernames and passwords, and mandatory controls
are based on things like the IP address of the requesting client.
Another way to keep discretionary versus non-discretionary
controls straight is to think about the way failures are handled: if you
fail a discretionary check (such as if you misspell your password), you
get another chance -- but if a mandatory check fails, you get a 'forbidden'
error rather than 'not authorised,' and there's no way to say "give me
another chance" without starting from scratch and requesting the page
again as though for the first time. And unless something's changed on
the server, even retrying isn't going to make a difference; you'll still
be locked out.
Authentication versus Authorisation
Authentication is the process of verifying that credentials
are correct -- that is, that the username is in the database and the password
is correct for the username. Authorisation is the process of checking
to see if a validated client is permitted to access a particular resource.
For instance, Bob may have correctly supplied his username and password,
but still not be able to access Jane's file because she hasn't included
him in the authorisation list for it.
In Apache, almost all of the security-related modules (see
a later section for a list) actually do both. The main
feature that distinguishes them from each other is their authentication
aspect; mostly, they let you store the valid credential information in
one format or another. mod_auth, for instance, looks in normal
text files for the username and password info, and mod_auth_dbm
looks in a DBM database for it. They handle the authorisation side of
their task in essentially identical ways, however.
The security modules are passed the information about what
authentication databases to use via directives, such as AuthUserFile
or AuthDBMGroupFile. The resource being protected is determined
from the placement of the directives in the configuration files; in this
example:
<Directory /home/johnson/public_html>
<Files foo.bar>
AuthName "Foo for Thought"
AuthType Basic
AuthUserFile /home/johnson/foo.htpasswd
Require valid-user
</Files>
</Directory>
the resource being protected is "any file named foo.bar",
in the /home/johnson/public_html directory or anywhere underneath
it. Likewise, the identification of which credentials are authorised
to access foo.bar is stated by the directives -- in this
case, any user with valid credentials in the /home/johnson/foo.htpasswd
file can access it.
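As a rough sketch of how little changes when the credentials live somewhere else, here is the same protection expressed with mod_auth_dbm instead of mod_auth. The DBM file path is an assumption made up for illustration; only the directive naming the database differs, and the authorisation side (the Require line) stays the same.

```apache
<Directory /home/johnson/public_html>
    <Files foo.bar>
        AuthName "Foo for Thought"
        AuthType Basic
        # Hypothetical DBM database; mod_auth_dbm reads this instead of a text file
        AuthDBMUserFile /home/johnson/foo-users
        Require valid-user
    </Files>
</Directory>
```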
Realms: Areas of Controlled Access
In terms of discretionary control mechanisms on the Web,
each protected area, whether it be a single document or an entire server,
is called a realm. When a server challenges a client for credentials,
it provides the name of the realm so the client can figure out which credentials
to send.
The name of a realm is specified in the Apache configuration
files with the AuthName directive, which takes a single argument:
the name of the realm.
Note: In older versions of Apache, the entire remainder
of the line following the "AuthName " keyword was taken
to be the realm name. This caused problems when someone embedded a quotation
mark (") in the string, since in the actual HTTP protocol the realm
name is quoted. So more recent versions of Apache accept only a single
argument to the directive; if you want to use multiple words, like "This
is my realm", you need to enclose the entire string within quotation
marks so that it will look like a single 'word.'
Realm names are implicitly qualified by the URI to which
they apply, and subordinate URIs are implicitly part of the same realm.
This means that if <URL:http://foo.com/a/> is in realm
"Augh", then <URL:http://foo.com/a/b/c/foo.html> is
also in realm "Augh" unless it's been overridden.
The implicit qualification also means that even if <URL:http://foo.com/a/foo.html>
and <URL:http://foo.com/b/foo.html> are declared in
two separate statements as being in realm "Foo", they're actually two
different realms named "Foo". The only way they'd both be in the
same "Foo" realm is if they had a common ancestor that was in it (such as <URL:http://foo.com/>).
The qualification rules will cause the client to prompt
for credentials whenever it requests a document in a realm it hasn't visited
before -- even if it's visited a different realm with the same name.
There is no default for the AuthName directive,
except what might be inherited from an upper-level directory.
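To make the qualification rule concrete, here is a small sketch. The directory names and the password file are assumptions chosen for illustration, built on the DocumentRoot assumed earlier. Both containers use the realm name "Foo", but since neither URI is an ancestor of the other, a client would treat them as two different realms and prompt separately for each.

```apache
# Two sibling directories: same realm name, two distinct realms
<Directory /usr/local/web/htdocs/a>
    AuthName "Foo"
    AuthType Basic
    # Hypothetical password file
    AuthUserFile /usr/local/web/apache/auth/.htpasswd-foo
    Require valid-user
</Directory>
<Directory /usr/local/web/htdocs/b>
    AuthName "Foo"
    AuthType Basic
    AuthUserFile /usr/local/web/apache/auth/.htpasswd-foo
    Require valid-user
</Directory>
```

Declaring the realm once on a common ancestor (here, /usr/local/web/htdocs itself) would instead put both directories in the same "Foo" realm.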
The Client/Server Authentication Handshake
When a client first attempts to access a document that's
under some sort of discretionary access control, a lot goes on behind
the scenes that the end-user probably never sees. Since on the first attempt
the client won't know that the resource is protected, it won't include
any credentials. When the server receives the request, it will go through
all the phases of access checking; when the credentials (none)
don't match any that are valid for the resource, the server will return
a 'not authorised' status.
In almost all cases, a client that receives such a 'not
authorised' response will realise that it didn't send any credentials,
and will pop up a dialogue box for the end-user to complete. This box
will display the name of the realm in which the document resides, and ask the user
for a username and password. Once obtained, the client will make the same
request again, only this time it will include the credentials. But as
far as the end-user is aware, that first request was completely invisible
and never happened.
If the client gets a 'not authorised' status in response
to a request that included credentials, it typically responds a little
differently: it will probably tell the user 'those credentials weren't
accepted, want to try again?' It didn't say that the first time because
it hadn't sent any.
In either case, if the end-user opts to not fill in the
dialogue and presses 'cancel,' the client typically just displays the
error page that the server sent along with the 'not authorised' status,
and goes back to waiting for instructions.
Apache Security Processing Phases
The preceding sections have been subtly leading up to this
topic. Apache handles all requests by running them through phases. Each
Apache module has an opportunity to deal with the request during each
of the phases, though most modules only do so for one or possibly two
of them.
Apache has three processing phases relating to security
checking. They occur in the following order, and are given the following
names:
- access_checker -- This phase is where mandatory access
checks are applied, such as mod_access's check for whether
the client's IP address is allowed to access the document or not
- check_user_id -- This is the authentication phase,
during which a DAC module such as mod_auth checks the
user credentials to see if they're even in the database it's been
told to use
- auth_checker -- This is the phase during which authorisation
occurs; modules like mod_auth check to see if the user
(who has already been authenticated) is allowed to access the document
Modules that impose discretionary access checks usually
participate in the latter two phases.
Basic Authentication versus Digest Auth
How do the username and password get transmitted across
the network? Well, in early 2000 the answer is: not very well. It's not
that there are technical problems with the transmission; rather, the issues
are more philosophical.
There are currently two main methods of passing credentials,
called Basic authentication and Digest authentication. The
Digest method is considerably more secure, but unfortunately less widely
deployed -- so most authentication on the Web is done using the less-secure
Basic mechanism.
Basic authentication involves simply base64-encoding the
username and password and transmitting the result to the server. This
means that anyone who can intercept the transmission can determine the
username and password. Of course, this is only useful if those values
are valid and end up getting successfully authenticated. <grin>
Digest authentication transmits the information in a manner that cannot
be so easily decoded.
Since the username and password are so trivially protected
in the Basic authentication mechanism, the same authentication database
can be used to store user information for multiple realms. The Digest
mechanism, though, includes an encoding of the realm for which the credentials
are valid, so you must have a separate credentials database for each realm
using the Digest method.
When setting up discretionary controls in your Apache configuration,
remember that the AuthType directive is required.
The setting can be inherited from a higher-level directory or location,
but something must set the value to be inherited; there is no default.
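For comparison, a Digest-protected area might be sketched as follows. This assumes a Digest module (mod_auth_digest or the older mod_digest) is compiled into the server, and the directory and file path are hypothetical; AuthDigestFile is meant to name a file maintained with the htdigest tool. The realm given to AuthName has to match the realm stored in that file, since Digest credentials are tied to a realm.

```apache
<Directory /usr/local/web/htdocs/private>
    AuthName "Private Area"
    # There is no default, so AuthType must be stated here or inherited
    AuthType Digest
    # Hypothetical path to a file created with htdigest
    AuthDigestFile /usr/local/web/apache/auth/.htdigest-private
    Require valid-user
</Directory>
```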
Mixing Mandatory and Discretionary Controls -- The Satisfy Directive
Sometimes you want to mix and match discretionary and non-discretionary
access controls, such as allowing anyone on the local network to see documents
freely, but requiring anyone else to enter a username and password.
This can be done with the Satisfy directive,
which takes a single keyword:
All
- In order to gain access to documents within the scope of a
Satisfy All directive, a client must pass both any applicable
non-discretionary controls (such as Allow or Deny directives)
and any discretionary ones (like Require directives).
Any
- Documents within the scope of a Satisfy Any directive
are accessible to any client that passes either the non-discretionary
checks (which occur first) or the discretionary ones.
To illustrate, the following would permit any client on
the local network (IP addresses 10.*.*.*) to access the foo.html
page without let or hindrance, but require a username and password for
anyone else:
<Files foo.html>
Order Deny,Allow
Deny from All
Allow from 10.0.0.0/255.0.0.0
AuthName "Insiders Only"
AuthType Basic
AuthUserFile /usr/local/web/apache/.htpasswd-foo
Require valid-user
Satisfy Any
</Files>
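By way of contrast, here is a sketch of the stricter combination, reusing the same file name and password database from the example above. With Satisfy All, a client must be on the 10.*.*.* network and must also supply a valid username and password; passing only one of the two checks is not enough.

```apache
<Files foo.html>
    # Mandatory check: only the local network gets past this point
    Order Deny,Allow
    Deny from All
    Allow from 10.0.0.0/255.0.0.0
    # Discretionary check: a valid login is needed as well
    AuthName "Insiders Only"
    AuthType Basic
    AuthUserFile /usr/local/web/apache/.htpasswd-foo
    Require valid-user
    # Both sets of controls must be passed
    Satisfy All
</Files>
```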
Restricting by IP Address
Since the IP address is one of those aspects of a client-server
HTTP relationship that cannot be changed mid-stream, and cannot be easily
faked (without the cooperation of the intervening network systems), it's
considered a non-discretionary control. The Apache distribution includes
a module for limiting access thusly, called mod_access.
mod_access allows you to specify what domains
or addresses should or should not be allowed access, and in which order
the two lists (allowed and denied) should be evaluated. The basic syntax
of the Allow and Deny directives is
Allow from host-or-network
The host-or-network can be:
- a host or domain name (www.foo.com),
- an IP address (10.0.72.3),
- an IP address and subnet mask (10.0.0.0/255.0.0.0), or
- an IP address and CIDR mask size (10.73.128.0/18)
Whenever possible you should use IP addresses instead of
domain names; using names means that the Apache server needs to do a double-reverse
lookup on the client's IP address in order to compare it against the names.
(A double-reverse lookup, which is always done by Apache when dealing
with host names in security-related situations, involves translating the
client's IP address to a name, and then translating that name back to a
list of IP addresses. If the translations don't agree in both directions, Apache
will consider the host/domain name match to have failed.)
As an added fillip, an alternate form of the Allow
and Deny directives, "from env=[!]envariable-name",
allows you to make the go/no-go decision based upon the presence (or absence)
of an environment variable. The envariable may have been set for the entire
server environment, or it may have been set just for the current request
by a module such as mod_setenvif.
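A minimal sketch of the environment-variable form, assuming mod_setenvif is available. The variable name let_me_in, the User-Agent pattern, and the directory are all made up for illustration.

```apache
# Set let_me_in for this request only when the User-Agent begins with "KnockKnock/2.0"
SetEnvIf User-Agent "^KnockKnock/2\.0" let_me_in
<Directory /usr/local/web/htdocs/internal>
    Order Deny,Allow
    Deny from all
    # Admit only requests for which the variable was set
    Allow from env=let_me_in
</Directory>
```

Bear in mind that a header such as User-Agent is supplied by the client and is easy to forge, so a check like this is more of a convenience than a real barrier.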
The Order directive controls how the cumulative
lists of Allow and Deny directives are interpreted.
If the order is Allow,Deny (note that no spaces are permitted
between the keywords!), then the initial state is the equivalent of Deny from All,
the Allow conditions are processed, and then the Deny
list is. For Order Deny,Allow, the opposite is the case
-- the initial state is 'allow everyone,' then denials are handled, and
then the allows are used to override them.
The easy way to remember the default state is to recall
that it matches the last keyword: Deny,Allow means 'allowed,'
and Allow,Deny means 'denied.'
There is a third possibility for the Order
directive: mutual-failure. With this keyword, there is no
'default state' -- the only clients that will be allowed in are those
that don't appear on any Deny directive, but do
appear on at least one Allow directive.
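The following sketch shows the two main orderings side by side. The directories and domain names are placeholders. In the first container, hosts in foo.com are allowed unless they are in the bad.foo.com subdomain, and everyone else falls through to the 'denied' default; in the second, everyone is denied except the 10.*.*.* network.

```apache
# Allow,Deny: default is 'denied'; Allows are applied, then Denys override them
<Directory /usr/local/web/htdocs/partners>
    Order Allow,Deny
    Allow from foo.com
    Deny from bad.foo.com
</Directory>

# Deny,Allow: default is 'allowed'; Denys are applied, then Allows override them
<Directory /usr/local/web/htdocs/intranet>
    Order Deny,Allow
    Deny from all
    Allow from 10.0.0.0/255.0.0.0
</Directory>
```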
Restricting by User Credentials
If you want to protect pages such that visitors need to
enter a username and password, the mod_auth module is your
tool. It is one of the simplest and easiest to use of the discretionary
control modules.
The key directives in establishing access controls are those
that define the location of the credential database and identify the authorised
users. For mod_auth, the directives in question are AuthUserFile
and Require. Other modules have similar directives.
The AuthUserFile directive simply takes a fully-specified
filename path (such as /home/foo/.htpasswd-foo), which tells
the module where to find the text authentication file to
use in the current realm. Neither path-shortening nor relative file specifications
are permitted.
The Require directive is actually part of the
core server rather than being specific to mod_auth, so it's
documented (however sparsely) at <URL:http://www.apache.org/docs/mod/core.html#require>.
Require is covered in more detail shortly.
Labeling
Different URLs within a realm can be protected in different
ways, with different sets of credentials being valid for different locations.
However, since the realm is the key the client uses to remember which
credentials to send, being egregious about using multiple sets of credentials
within the same realm tends to annoy users when they have to re-authenticate
repeatedly for what looks like (and in fact is) the same realm. It's generally
a good idea to have a one-to-one relationship between realms and sets
of authorised credentials.
But how do you turn on access control in the first
place? Just as you apply any other Apache directive: by having the directives
appear in the appropriate scope. For example:
<Directory /usr/local/web/htdocs/finance>
AuthName Finance
AuthType Basic
AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
Require valid-user
</Directory>
This will protect the finance subdirectory
and all files and subdirectories in it and below it. Other directories,
such as products, remain unaffected.
<Directory> containers are all very well,
but what if you want to protect only a single file? Or perhaps a document
that isn't mapped to the filesystem, like the output from mod_status?
The answer remains the same: use the appropriate scoping directives (such
as <Files> and <Location>) to apply
the security measures to the items you want protected.
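As a sketch of the unmapped-document case, the following protects the mod_status output. It assumes mod_status is compiled in and that the status page is published at /server-status; the realm name and password file are hypothetical.

```apache
<Location /server-status>
    # Hand requests for this URL to mod_status
    SetHandler server-status
    AuthName "Server Status"
    AuthType Basic
    # Hypothetical password file, kept outside the DocumentRoot
    AuthUserFile /usr/local/web/apache/auth/.htpasswd-admin
    Require valid-user
</Location>
```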
Inheritance
Like almost all other Apache configuration details, the
security directives that apply to a particular document or directory may
be inherited from the parent, or possibly even further up the tree. This
means that at each level you need only supply those directives that are
different. The following two fragments are equivalent:
<Directory /usr/local/web/htdocs/finance>
AuthName "Finance Department"
AuthType Basic
AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
Require valid-user
</Directory>
<Directory /usr/local/web/htdocs/finance/strategy>
AuthName "Finance Department"
AuthType Basic
AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
Require user susan bob
</Directory>

<Directory /usr/local/web/htdocs/finance>
AuthName "Finance Department"
AuthType Basic
AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
Require valid-user
</Directory>
<Directory /usr/local/web/htdocs/finance/strategy>
Require user susan bob
</Directory>
The second fragment takes advantage of the inheritance of
the values from the parent directory, and simply restricts the access
list to only Bob and Susan.
It's generally not a good idea to make too many assumptions
when dealing with security matters, so even though inheritance can seem
to make your life easier by not requiring you to duplicate directives
all over the place, this might be an illusion. Just wait until you see
how complicated your life becomes when all the inherited values become
compromised because of a single mistake at a higher level.
A related subject involves determining which of possibly
several access control modules has the Final Say on whether access is
granted or not. This is covered in a later section.
Requiring a Specific Username
Whereas the AuthUserFile directive and friends
tell Apache (and the security modules) where to find the authentication
databases, it's the Require directive that provides the instructions
on how to use them. If a scope doesn't include (or inherit) a Require
directive, then it isn't under discretionary access control regardless
of whatever other directives may be present.
Multiple occurrences of Require are cumulative;
each line gets added to the list of conditions. Whether processing stops
at the first matching condition or if all of them need to be met is up
to the module programmer; for mod_auth, for example, the
first match satisfies the condition for access, even if the configuration
contains something potentially confusing like:
AuthUserFile /home/foo/.htpasswd-foo
Require user foo
Require user bar
In this case (and in most cases, in fact), the intended
meaning is, "Require the username to be foo OR bar."
To avoid complicated configuration files when the access
list is large, there's a shortcut notation: "Require valid-user".
This means, "any of the usernames in the authentication database can access
this realm." Obviously this won't work unless the database contains credentials
only for users allowed access; if there are any users in it who
aren't supposed to have access (such as might happen if you're
sharing a single database across multiple realms), you'll need to use
grouping or some other mechanism because the valid-user keyword
won't grind finely enough.
Even though the Require directive isn't specific
to any particular module, the syntax of its arguments is. That means that
the core server doesn't place any restrictions on the syntax; "Require candy-type caramel"
will be accepted, on the grounds that one of the security modules may
understand what it means.
Most of the discretionary control modules also provide support
for grouping users together, and granting access to groups rather than
individuals. This can be done (for mod_auth) with the AuthGroupFile
directive. Like the user file, the group file simply contains lines of
text. Each line consists of a group name, a colon, and a list of space-separated
usernames. When the username is decoded from the request credentials,
the module can look it up in the group file to see to which group(s) it
belongs. Here's an example group file:
board: annette bill james gwynyth
finance: susan steve phoebe zoe bill_s
engineering: geekboy lisa melanie george j_johnson
To allow access by group, you simply change the Require
directive to name one or more groups instead of users, for example
Require group finance for the group file above.
As with normal Unix users, a single username may belong
to multiple groups.
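Putting the pieces together, a group-protected directory might look like the following sketch. The file locations are assumptions, and the group names come from the example group file above.

```apache
<Directory /usr/local/web/htdocs/finance>
    AuthName "Finance Department"
    AuthType Basic
    AuthUserFile /usr/local/web/apache/auth/.htpasswd-finance
    # Hypothetical location of the group file
    AuthGroupFile /usr/local/web/apache/auth/.htgroup
    # Any authenticated member of either group is allowed in
    Require group finance board
</Directory>
```

The user still has to authenticate against the password file; the group file only says which authenticated users are authorised.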
The Standard Apache Security Modules
Below is a list of the security-related modules that are
included as part of the standard Apache distribution.
mod_access
- This is the only module in the standard Apache distribution which
applies mandatory controls. It allows you to list hosts, domains,
and/or IP addresses or networks that are permitted or denied access
to documents.
mod_auth
- This is the basis for most Apache security modules;
it uses ordinary text files for the authentication database. Entries
are of the form "username:password"; additional
fields may follow the password, separated from it by a colon, but
they're ignored.
mod_auth_db
- This module is essentially the same as mod_auth,
except that the authentication credentials are stored in a Berkeley
DB file format. The directives contain the additional letters "DB"
(e.g., AuthDBUserFile).
mod_auth_dbm
- Like mod_auth_db, save that credentials
are stored in a DBM file.
mod_auth_anon
- This module mimics the behaviour of anonymous FTP;
rather than having a database of valid credentials, it recognises
a list of valid usernames (i.e., the way an FTP server recognises
ftp and anonymous) and grants access to
any of those with essentially any password. This module is more useful
for logging access to resources and keeping robots out than it is
for actual access control.
mod_auth_digest
- Whereas the other discretionary control modules supplied
with Apache all support Basic authentication, mod_auth_digest
is currently the sole supporter of the Digest mechanism. It underwent
some serious revamping in 1999, and the new version is currently considered
'experimental,' but no problems have been identified with the new
code and it's likely to be moved back into the standard stable set soon.
Like mod_auth, the credentials used by this module are
stored in a text file. Digest database files are managed with the
htdigest tool. Using mod_digest
is much more involved than setting up Basic authentication; please
see the module documentation
for details.
Allowing Users to Control Access to Their Own Documents
All of the security-related module directives can be used
in per-directory .htaccess files. However, in order
for Apache to pay attention to them, the directories in question need
to be within the scope of an AllowOverride directive that
includes the AuthConfig (for discretionary controls) or Limit
(for mandatory controls) keywords. For instance, a standard Linux installation
of Apache can enable this with the following lines in the httpd.conf
file:
<Directory /home/*/public_html>
AllowOverride AuthConfig Limit
</Directory>
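With that in place, a user could drop a file like the following into a directory under public_html. This is only a sketch: the username johnson is borrowed from the earlier example, the directory and file names are assumptions, and the password file is deliberately kept outside public_html.

```apache
# Hypothetical file: /home/johnson/public_html/private/.htaccess

# Discretionary controls (honoured because of AllowOverride AuthConfig)
AuthName "Private Pages"
AuthType Basic
AuthUserFile /home/johnson/.htpasswd-private
Require valid-user

# Mandatory controls (honoured because of AllowOverride Limit)
Order Deny,Allow
Deny from all
Allow from 10.0.0.0/255.0.0.0
```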
Using Your System passwd File
This is a common request, and an incredibly bad idea: "How
can I use my system's /etc/passwd file as my Web authentication
database?"
The simple answer is: you don't. I'll just list a
couple of reasons:
- If someone manages to crack the username and password of someone
accessing a Web page, that person can now log onto your system. (Remember,
most of the Web authentication uses the Basic method, which is incredibly
simple to crack.)
- Unlike your system's login system, which will probably kick you
out, disconnect you, lock your account, or do something equally extroverted
and paranoid (and log the fact!) if you misspell your password
a few times in a row, there are no such controls on the Web. So someone
could very easily write a script that just banged away on your system,
trying endless combinations of usernames and passwords, and nothing
would automatically perk up and make rude noises.
If you still want to do it after reading the above
and the additional information in the Apache FAQ, well, on your own head
be it. You can do it with mod_auth, and that's all I'm
going to say about it. And that's probably already too much, too.
Which Database is Authoritative?
What if you want to mix and match and have multiple types
of authentication database within a single realm? How does Apache figure
out which one to check first, and how does it know to consult another
if the first one fails to find the credentials?
The answer has to do with authoritativeness. Each of the
discretionary control modules includes a directive named something like
AuthAuthoritative. Each module's version of this directive
is named differently, so that it can be associated with that module and
no other, so we also have AuthDBAuthoritative, AuthDBMAuthoritative,
and Anonymous_Authoritative.
If a module is considered authoritative, then when Apache
gets an "I don't know this person" response, it won't look any further.
If the module isn't authoritative, the server can proceed to consult
another module.
Technical note: Actually, the decision isn't made by
the server itself. Each module knows whether or not it's authoritative
(based on the presence/absence/setting of its *Authoritative
directive), and so in the case of a failure it signals the stop/continue
answer to the server by returning either HTTP_UNAUTHORIZED
or DECLINED respectively.
By default, the modules tend to consider themselves authoritative
until you tell them otherwise, on the principle that it's better to be
safe than sorry. You can make this explicit with an AuthAuthoritative On
line, or allow responsibility sharing with AuthAuthoritative Off.
(Use the appropriate directive for the module in question!)
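As an illustration, the following sketch consults a text file and lets a DBM database have a try when the username isn't found there. It assumes both mod_auth and mod_auth_dbm are compiled in, and the directory and file names are hypothetical. Which module actually gets asked first depends on how the modules were built into the server, so treat the ordering implied here as an assumption too.

```apache
<Directory /usr/local/web/htdocs/shared>
    AuthName "Shared Area"
    AuthType Basic
    # mod_auth: text file of credentials
    AuthUserFile /usr/local/web/apache/auth/.htpasswd-shared
    # If mod_auth doesn't know the user, let another module look
    AuthAuthoritative Off
    # mod_auth_dbm: DBM database of credentials
    AuthDBMUserFile /usr/local/web/apache/auth/users-dbm
    # mod_auth_dbm has the final say: an unknown user is refused
    AuthDBMAuthoritative On
    Require valid-user
</Directory>
```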
The htpasswd, htdigest, and dbmmanage Utilities
These three utilities are considered 'user' tools, since
you don't need to be the Webmaster in order to use them to create access
control files for your own Web directory. As user applications, their
documentation is in the man/man1 subdirectory of your Apache
server installation; you can read it with a command such as:
% man /usr/local/web/apache/man/man1/htpasswd.1
Given the assumptions stated earlier, you should find all
three of these applications in the /usr/local/web/apache/bin/
directory, and the source of their man pages in /usr/local/web/apache/man/man1/.
The htpasswd application is used to create
and maintain text-based authentication databases for use with the mod_auth
module. It gets the username and options from the command line, prompts
for and reads the password from standard input (twice, for verification),
and stores the username and the encrypted password in the specified text
file. When the Apache server receives credentials to verify, it encrypts
the submitted password using the same algorithm as the stored password,
and then compares the results -- so the actual plaintext password doesn't
live in a file on your system.
The syntax of the htpasswd command is:
htpasswd [options] pwfile username [password]
htpasswd can encrypt the passwords using a
variety of algorithms, indicated by the algorithm flag on the command
line:
-m
- Causes the password to be encrypted using an Apache-specific modified
MD5 hash algorithm. Although no other application can understand passwords
encrypted this way, they work on all Apache systems running
1.3.9 or later, and so you can transport your .htpasswd
file from Linux to AIX to Solaris to Windows and have it work in each
place without any changes. This is the default algorithm for the Windows
and TPF platforms.
-d
- Use the system's crypt() library routine
to encrypt the password. This means that the encrypted passwords will
be as safe as those in the system's user file -- but they're probably
not transportable to any other system.
-s
- This will cause the password to be encrypted using
the SHA algorithm, which is used by Netscape servers. This is useful
when migrating from one server to the other.
-p
- The -p flag means 'plaintext -- don't
encrypt the password at all.' This was added because of a problem
in Apache 1.3.6 on Windows, which prevented MD5-encrypted passwords
(the only other type supported on Windows by that version) from being
correctly recognised. Don't use this option unless you're working
with a password file for Apache 1.3.6 on Windows. Even then the
vastly preferred remedy is to upgrade to a more recent version; 1.3.6
is from early 1999.
The encryption algorithm used is particular to each entry
in the file, so it's entirely possible for a file to contain passwords
encrypted in different ways.
The htpasswd tool understands two other flags,
which control other aspects than encryption:
-b
- Get the password from the command line rather than reading it from
stdin . This flag is primarily intended to help Windows
Webmasters, but it's useful on other platforms as well, as it allows
script-based password management in a non-interactive environment
(such as allowing a user to change is password with a CGI script).
However, since the password appears in plaintext on the command line,
it might be visible to another user in the output of a ps
command, and there's no verification that it was spelt correctly.
Use this option with caution.
-
-
-c
-
By default, htpasswd assumes that the
pwfile authentication database file already exists,
and will update it. To create a new one, or completely overwrite an
existing one, add the -c flag to the command line.
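For scripted use, the two flags can be combined. The following is a minimal sketch of how a script might assemble such a command. It only builds the argument list and leaves running it (for example with `subprocess.run`) to the caller. The helper name and the password-file location are assumptions made for illustration, and it assumes htpasswd is on the PATH. The caveat about -b exposing the password to ps still applies.

```python
def htpasswd_command(pwfile, user, password, create=False):
    """Return the argv list for a non-interactive htpasswd call."""
    argv = ["htpasswd", "-b"]   # -b: take the password from the command line
    if create:
        argv.append("-c")       # -c: create the file, overwriting any old one
    argv += [pwfile, user, password]
    return argv


# Hypothetical file location, chosen to sit outside the DocumentRoot.
print(htpasswd_command("/usr/local/web/apache/passwd/users",
                       "alice", "s3cret", create=True))
```

Keeping create as an explicit opt-in makes it harder to wipe an existing database by accident.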
The htdigest and dbmmanage tools,
also in the /usr/local/web/apache/bin/ directory, are similar
to the htpasswd application. htdigest allows
you to maintain text database files for use with Digest authentication,
and dbmmanage supports the DB, DBM, GDBM, and NDBM database
formats. dbmmanage is a Perl script, so you will need to
have the Perl interpreter (version 5 or later) installed on your system
in order to use it.
Location of Your Authentication Database
Remember that one of the main things the Apache Web server
does is serve up files to visitors from the Internet -- so don't put
your authentication database files anyplace where that could happen to
them!
For server-wide database files (that is, those managed by
the Webmaster and listed in the httpd.conf file, rather than
in users' .htaccess files), make sure you put them someplace
where they're not under the DocumentRoot. Also make sure you don't
put them someplace where they're under an Aliased or ScriptAliased
directory.
For access control used by individual users to protect their
own documents, the database files should not be under the directory
listed in the UserDir directive in the server's httpd.conf
file (typically public_html). Having your users put their
database files in their home directory, or in another subdirectory (other
than under public_html!) is a good idea.
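A quick sanity check along these lines can be scripted. The sketch below assumes the DocumentRoot used throughout this article; the other two directories and both function names are hypothetical. It compares path strings only, so it will not notice symbolic links that lead back into a served directory.

```python
import os.path


def is_under(path, root):
    """True if path is root itself or lies somewhere beneath it."""
    path = os.path.normpath(path)
    root = os.path.normpath(root)
    return os.path.commonpath([path, root]) == root


def exposed_by(pwfile, served_dirs):
    """Return the served directories that contain pwfile (ideally none)."""
    return [d for d in served_dirs if is_under(pwfile, d)]


served = [
    "/usr/local/web/htdocs",          # DocumentRoot assumed in this article
    "/usr/local/web/apache/cgi-bin",  # hypothetical ScriptAlias target
    "/home/alice/public_html",        # hypothetical UserDir directory
]

print(exposed_by("/usr/local/web/htdocs/private/.htpasswd", served))
print(exposed_by("/usr/local/web/apache/passwd/users", served))
```

An empty list for a given database file means none of the listed directories would serve it.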
Recent versions of Apache (those newer than 1.3.4 or so)
include a default limitation on the common filenames used for per-directory
authentication databases:
<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>
This will prevent the server from processing requests for
files named .htpasswd, .htaccess, .htpasswd-foo.db,
and so on. Note that if you upgraded your Apache server from an
earlier version, your httpd.conf file may not include these
lines, and you may want to add them yourself.
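To see which names that pattern covers, here is a small sketch that applies the same regular expression to a few filenames. Python's `re` module is not the engine Apache uses, but for a pattern this simple the two should agree. The helper name is hypothetical.

```python
import re

# Same pattern as in the <Files ~ "^\.ht"> section.
HT_PATTERN = re.compile(r"^\.ht")


def is_blocked(filename):
    """True if the bare filename matches the <Files> pattern."""
    return HT_PATTERN.search(filename) is not None


for name in (".htpasswd", ".htaccess", ".htpasswd-foo.db", "passwords.txt"):
    print(name, is_blocked(name))
```

Note that a database called passwords.txt gets no protection from this rule, which is one more reason to keep such files outside the served directories altogether.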
Frequently-Asked Apache Security Questions
I've tried to address most of the common questions about
Apache's security mechanisms that keep cropping up, but here are a couple
I didn't cover (but which are still common):
- Q:
- How do I invalidate credentials? Someone has logged in to a protected
page, but now wants to 'log out' so no-one else can use his browser
window to access the page without logging in again. How do I make
his browser forget the credentials that worked the first time?
- A:
- The simplest way is to redirect the client to a script that always
returns a '401 Unauthorised' status, no matter what. That tells the
client its credentials are invalid, so it should throw them away.
To make this work, the script needs to be in the realm for which the
credentials are being invalidated. The big disadvantage to this method
is that the default client behaviour on getting a 401 status is to
ask the user for new credentials -- so it's not a seamless operation.
For a truly invisible invalidation of credentials, you need to remove
them from the authentication database -- which means the user won't
be able to log back in again. {sigh} It's not an easy thing to do;
read the various discussions about it in the www-talk mailing list
archives at the W3C.
- Q:
- How can I use the dbmmanage tool to
manage an AuthDBMGroupFile database?
- A:
- In a word, you can't. At some point in the Apache
1.3 development cycle, the dbmmanage script was altered
in such a way that it can now only deal with user files, and not with
group files any more. This is a known deficiency, though, and hopefully
the ability to handle group files will be added again to a release
in the not-too-distant future.
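Returning to the first question above, the 'always 401' script can be quite small. What follows is only a sketch of such a CGI script, assuming Basic authentication. The realm name is a placeholder that must match the AuthName of the protected area, the script has to live inside that realm, and the function name is hypothetical.

```python
import sys


def logout_response(realm):
    """Build a CGI response that always reports 401 for the given realm."""
    headers = [
        "Status: 401 Unauthorized",
        'WWW-Authenticate: Basic realm="%s"' % realm,
        "Content-Type: text/plain",
    ]
    body = "You have been logged out.\n"
    # A blank line separates the CGI headers from the body.
    return "\n".join(headers) + "\n\n" + body


if __name__ == "__main__":
    # "Members Only" is a placeholder; use the realm's actual AuthName.
    sys.stdout.write(logout_response("Members Only"))
```

As the answer warns, most browsers will react to the 401 by prompting for new credentials, so the user still has to dismiss that dialogue.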
Going Further
You can also find some documentation at the following URLs:
- <URL:http://www.w3c.org/> (look for the archives of the www-talk mailing list)
- <URL:http://www.apache.org/docs/mod/mod_access.html>
- <URL:http://www.apache.org/docs/mod/mod_auth.html>
- <URL:http://www.apache.org/docs/mod/mod_auth_db.html>
- <URL:http://www.apache.org/docs/mod/mod_auth_dbm.html>
- <URL:http://www.apache.org/docs/mod/mod_auth_digest.html>
- <URL:http://www.apache.org/docs/mod/core.html> (see the Satisfy and Require directives)
- <URL:http://modules.apache.org/> (the Apache modules registry, for third-party modules)
Conclusion
Apache provides a rich set of control mechanisms for protecting
Web pages, and continues to track emerging standards, such as Digest
Authentication, very closely. With care and a little creativity, you
should be able to easily apply whatever protections you want to your Web
site.
Got a Topic You Want Covered?
If you have a particular Apache-related topic that you'd
like covered in a future article in this column, please let me know; drop
me an email at <coar@Apache.Org>.
I do read and answer my email, usually within a few hours (although
a few days may pass if I'm travelling or my mail volume is 'way up). If
I don't respond within what seems to be a reasonable amount of time, feel
free to ping me again.